TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
This paper describes the participation of the team "TwiSE" in the SemEval
2016 challenge. Specifically, we participated in Task 4, namely "Sentiment
Analysis in Twitter" for which we implemented sentiment classification systems
for subtasks A, B, C and D. Our approach consists of two steps. In the first
step, we generate and validate diverse feature sets for Twitter sentiment
evaluation, inspired by the work of participants of previous editions of such
challenges. In the second step, we focus on the optimization of the evaluation
measures of the different subtasks. To this end, we examine different learning
strategies by validating them on the data provided by the task organisers. For
our final submissions we used an ensemble learning approach (stacked
generalization) for Subtask A and single linear models for the rest of the
subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14
for subtasks A, B, C and D respectively. We make the code available
for research purposes at
https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation.
Multitask Learning for Fine-Grained Twitter Sentiment Analysis
Traditional sentiment analysis approaches tackle problems like ternary
(3-category) and fine-grained (5-category) classification by learning the tasks
separately. We argue that such classification tasks are correlated and we
propose a multitask approach based on a recurrent neural network that benefits
from jointly learning them. Our study demonstrates the potential of multitask
models on this type of problem and improves the state-of-the-art results in
the fine-grained sentiment classification problem.
Comment: International ACM SIGIR Conference on Research and Development in
Information Retrieval 201
On a Topic Model for Sentences
Probabilistic topic models are generative models that describe the content of
documents by discovering the latent topics underlying them. However, the
structure of the textual input, for instance the grouping of words into
coherent text spans such as sentences, contains much information that is
generally lost with these models. In this paper, we propose sentenceLDA, an
extension of LDA whose goal is to overcome this limitation by incorporating the
structure of the text in the generative and inference processes. We illustrate
the advantages of sentenceLDA by comparing it with LDA using both intrinsic
(perplexity) and extrinsic (text classification) evaluation tasks on different
text collections.
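The abstract does not spell out how sentence structure enters the generative process. As a minimal sketch, assume that each sentence draws a single topic from the document's topic mixture and that every word in the sentence is then drawn from that topic. This one-topic-per-sentence rule, the toy vocabulary in `TOPIC_WORD`, and the helper names `sample_dirichlet` and `generate_document` are all illustrative assumptions, not the authors' implementation.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, sentence_lengths, alpha, rng):
    """Generate a document sentence by sentence.

    topic_word: list of per-topic word distributions (dict: word -> probability).
    sentence_lengths: number of words to generate for each sentence.
    alpha: Dirichlet prior over topics (one entry per topic).
    Assumption: one topic per sentence, shared by every word in that sentence.
    """
    theta = sample_dirichlet(alpha, rng)  # document-level topic mixture
    topics = list(range(len(topic_word)))
    document = []
    for length in sentence_lengths:
        # A single topic is drawn for the whole sentence, not per word as in LDA.
        z = rng.choices(topics, weights=theta, k=1)[0]
        words = list(topic_word[z].keys())
        probs = list(topic_word[z].values())
        sentence = rng.choices(words, weights=probs, k=length)
        document.append((z, sentence))
    return document


# Hypothetical two-topic vocabulary, for illustration only.
TOPIC_WORD = [
    {"gene": 0.5, "protein": 0.3, "cell": 0.2},
    {"tweet": 0.4, "sentiment": 0.4, "user": 0.2},
]

if __name__ == "__main__":
    rng = random.Random(0)
    for topic, sentence in generate_document(TOPIC_WORD, [4, 3, 5], [0.5, 0.5], rng):
        print(topic, " ".join(sentence))
```

In standard LDA the topic draw would sit inside the word loop; moving it to the sentence level is the only change this sketch makes.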
Results of the BioASQ tasks of the Question Answering Lab at CLEF 2015
The goal of the BioASQ challenge is to push research towards highly precise biomedical information access systems. We aim to promote systems and approaches that are able to deal with the whole diversity of the Web, especially for, but not restricted to, the context of bio-medicine. The third challenge consisted of two tasks: semantic indexing and question answering. 59 systems by 18 different teams participated in the semantic indexing task (Task 3a). The question answering task was further subdivided into two phases. 24 systems from 9 different teams participated in the annotation phase (Task 3b-phase A), while 26 systems from 10 different teams participated in the answer generation phase (Task 3b-phase B). Overall, the best systems were able to outperform the strong baselines provided by the organizers. In this paper, we present the data used during the challenge as well as the technologies which were used by the participants.